-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
admin: Support /drain_listeners?graceful #11639
Conversation
Signed-off-by: Auni Ahsan <[email protected]>
|
This this makes sense to me. You will want to add docs here: /wait |
Signed-off-by: Auni Ahsan <[email protected]>
Added docs, not sure if more detail is needed. I think the drain docs probably need an overhaul, but I don't really have the context with other parts of it to want to take that on. |
Signed-off-by: Auni Ahsan <[email protected]>
Signed-off-by: Auni Ahsan <[email protected]>
Signed-off-by: Auni Ahsan <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM with small comments.
/wait
docs/root/operations/admin.rst
Outdated
.. http:post:: /drain_listeners?graceful | ||
|
||
When draining listeners, enter a graceful drain period prior to closing listeners. | ||
This behaviour and duration is determined by server options for the drain manager, and |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please ref link to the relevant config and CLI options.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I couldn't figure out how to ref link to the server options, not sure if the ServerInfo proto is what you mean, but I linked the CLI options.
@@ -12,6 +12,8 @@ various events. Draining occurs at the following times: | |||
* The server is being :ref:`hot restarted <arch_overview_hot_restart>`. | |||
* Individual listeners are being modified or removed via :ref:`LDS | |||
<arch_overview_dynamic_config_lds>`. | |||
* The server begins shutdown sequence via the :ref:`drain_listeners?graceful |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Line 39 below needs updating. Please take a general pass through this page. You have as good an understanding of this as any, and we expect contributors to care for the docs and improve them as they make changes. Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My bad, I'm kind of rushing for a deadline and writing docs in markdown was a little confusing. Reorganized a lot of this doc and added more detail.
Signed-off-by: Auni Ahsan <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks LGTM with small comments.
/wait
In a few different scenarios, Envoy will attempt to gracefully shed connections. For instance, | ||
during server shutdown, existing requests can be discouraged and listeners set to stop accepting, | ||
to reduce the number of open connections when the server shuts down. Draining behaviour is defined | ||
by the server options in addition to indvidiaul listener configs. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo indvidiaul
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
* Individual listeners are being modified or removed via :ref:`LDS | ||
<arch_overview_dynamic_config_lds>`. | ||
|
||
By default, the Envoy server will close listeners immediately on server shutdown. To close listeners |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By default, the Envoy server will close listeners immediately on server shutdown. To close listeners | |
By default, the Envoy server will close listeners immediately on server shutdown. To drain listeners |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
:ref:`Redis <config_network_filters_redis_proxy>`, and | ||
:ref:`Mongo <config_network_filters_mongo_proxy>`. | ||
:ref:`HTTP connection manager <config_http_conn_man>`, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mixed up commas, period, etc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done I think. markdown is hard to read.
|
||
const bool graceful = params.find("graceful") != params.end(); | ||
if (graceful) { | ||
server_.drainManager().startDrainSequence([this, stop_listeners_type]() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this is called twice does something bad happen? Does this need to be guarded?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oooh good catch. Added a test, added a guard, left a comment because I'm not sure what the admin response should return if you've called /drain_listeners?graceful twice. Should it silently do nothing? (Since the server's already draining...) Or return an error on the admin call? (If so what error?)
Signed-off-by: Auni Ahsan <[email protected]>
Signed-off-by: Auni Ahsan <[email protected]>
Signed-off-by: Auni Ahsan <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Outstanding item on what to return from the admin endpoint in case of double calls to startDrainSequence()
In a few different scenarios, Envoy will attempt to gracefully shed connections. For instance, | ||
during server shutdown, existing requests can be discouraged and listeners set to stop accepting, | ||
to reduce the number of open connections when the server shuts down. Draining behaviour is defined | ||
by the server options in addition to indvidiaul listener configs. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
* Individual listeners are being modified or removed via :ref:`LDS | ||
<arch_overview_dynamic_config_lds>`. | ||
|
||
By default, the Envoy server will close listeners immediately on server shutdown. To close listeners |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
:ref:`Redis <config_network_filters_redis_proxy>`, and | ||
:ref:`Mongo <config_network_filters_mongo_proxy>`. | ||
:ref:`HTTP connection manager <config_http_conn_man>`, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done I think. markdown is hard to read.
|
||
const bool graceful = params.find("graceful") != params.end(); | ||
if (graceful) { | ||
server_.drainManager().startDrainSequence([this, stop_listeners_type]() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oooh good catch. Added a test, added a guard, left a comment because I'm not sure what the admin response should return if you've called /drain_listeners?graceful twice. Should it silently do nothing? (Since the server's already draining...) Or return an error on the admin call? (If so what error?)
|
||
const bool graceful = params.find("graceful") != params.end(); | ||
if (graceful) { | ||
if (server_.drainManager().drainClose()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this always going to return true depending on the settings? I think you need an actual "drain sequence started" boolean?
In terms of what to do, I don't feel strongly about it as long as it's documented well. I think returning 200 OK is fine with me.
/wait
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah actually this is going to flakily break on probabilistic drain. The underlying draining_
variable already exists, do you think I should just add a getter to the drain manager interface? Hopefully won't need to change the parent DrainDecision class.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I think a getter on the base interface is fine which would just say whether the drain sequence has started.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
Signed-off-by: Auni Ahsan <[email protected]>
Signed-off-by: Auni Ahsan <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM other than missing comment, thanks.
/wait
@@ -21,6 +21,8 @@ class DrainManager : public Network::DrainDecision { | |||
*/ | |||
virtual void startDrainSequence(std::function<void()> drain_complete_cb) PURE; | |||
|
|||
virtual bool draining() const PURE; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add doc comments for the interface method.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done, thanks for the quick turnaround.
Signed-off-by: Auni Ahsan <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
Mac OS X CI failure seemed unrelated (looks like it was just hanging, CC @asraa as on-call). |
Calling /drain_listeners?graceful will trigger the drain manager drain sequence prior to closing listeners. Risk Level: Low. Testing: Tested that connections are terminated on request complete during the graceful drain period, that new connections can still be opened, H1/H2-specific response behaviour. Docs Changes: Add docs to admin.rst, improve the overall drain sequence documentation. Signed-off-by: Auni Ahsan <[email protected]>
Calling /drain_listeners?graceful will trigger the drain manager drain sequence prior to closing listeners. Risk Level: Low. Testing: Tested that connections are terminated on request complete during the graceful drain period, that new connections can still be opened, H1/H2-specific response behaviour. Docs Changes: Add docs to admin.rst, improve the overall drain sequence documentation. Signed-off-by: Auni Ahsan <[email protected]> Signed-off-by: yashwant121 <[email protected]>
Signed-off-by: Auni Ahsan [email protected]
Calling /drain_listeners?graceful will trigger the drain manager drain sequence prior to closing listeners.
Risk Level: Low.
Testing: Tested that connections are terminated on request complete during the graceful drain period, that new connections can still be opened, H1/H2-specific response behaviour.
Docs Changes: Add docs to admin.rst, improve the overall drain sequence documentation.